system described by a certain type of Volterra series expansion or by a bilinear
or  state-linear  system  satisfying  certain  algebraic  conditions.  The  purpose  in 
this  paper  is  to  consider  estimation  problems  similar  to  those  presented  before, 
to  present  simpler  proofs  that  the  estimators  are  indeed  finite  dimensional, 
to  provide  deeper  insight  into  these  problems  by  relating  them  to  the 
homogeneous  chaos  of  Wiener  and  to  orthogonal  polynomial  expansions,  to  explain 
the  similarities  and  differences  between  the  continuous  and  discrete  time 
cases,  and  to  prove  some  extensions  of  previous  results.  The  existence  of 
polynomials  in  the  innovations  in  the  discrete  time  recursive  estimator,  in 
contrast  to  the  continuous  time  estimator,  is  interpreted  in  terms  of  the 
homogeneous  chaos.  The  existence  of  such  polynomials  in  the  innovations  in 
the optimal filter suggests that suboptimal filter design in discrete time
could be improved by incorporating such structure; this is in contrast to
most discrete time estimator designs, such as the extended Kalman filter, in
which the updated estimate is linear in the innovations and the higher
measurement space filter...

1.  Introduction

In [1]-[3] we have shown that, for certain classes of nonlinear
stochastic  systems  in  both  continuous  and  discrete  time,  the  optimal 
conditional  mean  estimator  of  the  system  state  given  the  past  observations 
can  be  computed  with  a recursive  filter  of  fixed  finite  dimension.  The 
typical  nonlinear  system  in  these  classes  consists  of  a linear  system  with 
linear  measurements  and  white  Gaussian  noise  processes,  which  feeds  forward 
into  a nonlinear  system  described  by  a certain  type  of  Volterra  series 
expansion  or  by  a bilinear  or  state-linear  system  satisfying  certain 
algebraic  conditions.  It  is  our  purpose  in  this  paper  to  consider  estimation 
problems similar to those in [1]-[3], to present simpler proofs that the
estimators are indeed finite dimensional, to provide deeper insight into
these problems by relating them to the homogeneous chaos of Wiener and to
orthogonal polynomial expansions [4]-[8], [24], to explain the
similarities and differences between the continuous and discrete time cases,
and to prove some extensions of our previous results. The existence of
polynomials in the innovations in the discrete time recursive estimator, in
contrast to the continuous time estimator (as noted in [2]), is interpreted
in terms of the homogeneous chaos. The existence of such polynomials in
the innovations in the optimal filter suggests that suboptimal filter
design in discrete time could be improved by incorporating such structure;
this is in contrast to most discrete time estimator designs, such as the
extended Kalman filter, in which the updated estimate is linear in the
innovations (exceptions are the quasi-moment estimators of [9] and [10]
and the higher measurement space filter of [23]).

2.  Problem  Statement 

As in [1]-[3], the classes of systems considered in this paper are
described as follows. It will be assumed that all random variables and
processes are defined on a probability space $(\Omega, B, P)$. In continuous
time, we consider systems of the form, for $t \in [0,T]$,

$$ dx(t) = A(t)x(t)\,dt + B(t)\,dw(t) \tag{1} $$

$$ dy(t) = f(x(t), y(t), t)\,dt \tag{2} $$

$$ dz_1(t) = C(t)x(t)\,dt + R^{1/2}\,dv(t) \tag{3} $$

where $x(t) \in R^n$, $y(t) \in R^m$, $z_1(t) \in R^p$, w and v are standard vector
Wiener processes, R > 0, x(0) is Gaussian, {x(0), y(0), w(t), v(s)} are
independent for all t and s, f is an analytic function of x and y, and
[A(t), B(t), C(t)] is completely controllable and observable.

The discrete time systems to be considered are of the form, for
$t \in \{0;T\}$,

$$ x(t+1) = A(t)x(t) + B(t)w(t) \tag{4} $$

$$ y(t+1) = f(x(t), y(t), t) \tag{5} $$

$$ z_2(t) = C(t)x(t) + R^{1/2} v(t) \tag{6} $$

where $T \in Z^+$, the set of positive integers, and $\{s;t\}$ is the set of
integers $\{s, s+1, \ldots, t\}$. The assumptions in (4)-(6) are the same as
those in (1)-(3), except that w(t) and v(t) are zero-mean Gaussian white
noise processes. Motivation for the study of systems of the form (1)-(3)
and (4)-(6) is presented in [3].

The optimal estimate, with respect to a wide variety of criteria
(including minimum mean square error), of x(t) given the past observations
$z_1^t \triangleq \{z_1(s), 0 \le s \le t\}$ or $z_2^t \triangleq \{z_2(s), s \in \{0;t\}\}$, is the
conditional mean x(t|t) of x(t) given the $\sigma$-field $F_t^{z_i}$ generated by
$z_i^t$, also denoted $E[x(t) | z_i^t]$. It is assumed that all the relevant
random variables are in $L_2(\Omega, B, P)$, so the conditional expectation
x(t|t) can also be interpreted as the orthogonal projection of x(t) onto
the subspace $L_2(\Omega, F_t^{z_i}, P)$ [14, App. A]; this interpretation will be
used in the sequel. Predicted and smoothed estimates will also be used
extensively, so we introduce the equivalent notations
$x(s|t) \triangleq E[x(s) | z^t] \triangleq E^t[x(s)] \triangleq E[x(s) | F_t]$. Thus our
objective is the recursive computation of x(t|t) and y(t|t). The
computation  of  x(t|t)  can  be  performed  by  the  recursive  n-dimensional 
(linear)  Kalman  filter  in  continuous  or  discrete  time.  It  is,  in  general, 
not  possible  to  compute  y(t|t)  with  a recursive  estimator  of  fixed  finite 
dimension. It has been proved in [1]-[3] that if the nonlinear system
(2) or (5) is characterized by a certain type of finite series expansion
or  by  certain  bilinear  or  state-affine  equations,  then  y(t|t)  can  be 
computed  by  such  a recursive  finite  dimensional  estimator.  Some  of  the 
major  results  can  be  summarized  as  follows. 

Let the Volterra series expansions for the i-th components of y(t) in
(2) and (5) be given by

$$ y_i(t) = w_{0i}(t) + \sum \int \cdots \int \sum \ldots \tag{7} $$

in continuous time and by the corresponding discrete time expansion (8),
respectively. Here $w_{ki}$ is called a k-th order kernel, and a finite
Volterra series expansion of order q is one such that all k-th order kernels
are zero for k > q. In the continuous case (7), we consider, without loss
of generality [11], only triangular kernels $w_{ki}(t, \sigma_1, \ldots, \sigma_k)$ ....
A kernel is separable if it can be expressed as a finite sum

$$ w_{ki}(t, \sigma_1, \ldots, \sigma_k) = \sum_{j=1}^{m} \gamma_{j0}(t) \gamma_{j1}(\sigma_1) \cdots \gamma_{jk}(\sigma_k). \tag{9} $$


Similar  definitions  can  be  made  in  discrete  time  [2],  but  they  are  more 
complicated (this difficulty is related to the fact that the solution of a
discrete  time  system  may  not  be  defined  backward  in  time  [12], [21]). 
Brockett  [11]  and  Gilbert  [13]  have  shown  that  the  kernels  in  (7)  are 
separable  if  f is  analytic.  Using  variational  expansions  similar  to  those 
of  Gilbert  [13],  it  is  straightforward  to  show  that  the  kernels  of  the 
Volterra  series  (8)  are  also  separable  in  the  sense  of  [2], [12];  this  is 
basically  due  to  the  fact  that  the  kernels  arise  from  the  variational 
equations  as  products  of  pulse  responses  of  linear  systems.  Brockett  [11] 
has  also  shown  that  a continuous  time  finite  Volterra  series  has  a bilinear 
realization  if  and  only  if  it  has  separable  kernels.  The  separability  and 
realizability results are crucial in the proofs of the following two
theorems.

Theorem 1 [1]: Consider the system (1)-(3), and assume that (2) has
a finite Volterra series expansion. Then y(t|t) can be computed with a
finite dimensional recursive estimator, i.e., by a finite set of nonlinear
stochastic differential equations driven by the innovations

$$ \nu_1(t) \triangleq z_1(t) - \int_0^t C(s) x(s|s)\,ds. \tag{10} $$

Theorem 2 [2]: Consider the system (4)-(6), and assume that (5) has
a finite Volterra series expansion. Then y(t|t) can be computed with a
finite set of nonlinear difference equations driven by the innovations

$$ \nu_2(t) = z_2(t) - C(t)A(t-1)x(t-1|t-1). \tag{11} $$


The basic technique employed in [1]-[3] to prove these theorems is
the  augmentation  of  the  state  of  the  original  system  with  additional  states 
which  arise  as  smoothed  statistics  of  the  original  state.  For  the  classes 
of  systems  considered  here,  it  is  shown  that  only  a finite  number  of 
additional  states  (smoothed  statistics)  are  required.  We  will  see  here, 
from  a different  point  of  view,  how  the  additional  filter  states  arise. 

In  addition,  we  will  prove  results  similar  to  Theorems  1 and  2 for  some 
systems in which equations (2) and (5) for y(t) contain an additive noise
term. 


In this paper both the continuous and discrete time problems will be
considered in a unified framework. It is useful first to contrast these
problems with the estimation and prediction problems considered by Huang
and Cambanis [8]. There the problem is that of estimating a nonlinear
functional y of a Gaussian process $\{x(t), t \in S\}$, given observations of
$\{x(t), t \in S'\}$, where $S'$ is a subset of S. In our problem the objective is
to recursively estimate a nonlinear functional y(t) of $x(\cdot)$, given
observations of linear functionals of $x(\cdot)$ plus noise. Although the
elegant formulas of Huang and Cambanis cannot be applied here, the
approach of utilizing the homogeneous chaos, or, equivalently, the
Cameron-Martin orthogonal series decomposition of a Gaussian process
[4]-[8], will prove to be quite useful in unifying and simplifying our results.

By employing the "innovations approach" [15],[16], the conditional
expectations y(t|t) of Theorems 1 and 2 can equivalently be viewed as
projections on Hilbert spaces generated by the innovations instead of the
observations. For the discrete time problem (4)-(6) it can easily be
shown recursively that $F_t^{z_2} = F_t^{\nu_2}$, so that y(t|t) is the projection
of y(t) onto $L_2(\Omega, F_t^{\nu_2}, P)$; in fact $\nu_2(\cdot)$ is just obtained from the
Gram-Schmidt orthogonalization of the sequence $z_2(\cdot)$. It has been shown
[15],[16] for continuous time Gaussian processes (as in (1),(3)) that
$F_t^{z_1} = F_t^{\nu_1}$; hence, y(t|t) is the projection of y(t) onto
$L_2(\Omega, F_t^{\nu_1}, P)$. The innovations process $\nu_1(t)$ is a Wiener process
with the same covariance as $R^{1/2} v(t)$ [14]-[16]; the innovations process
$\nu_2(t)$ is a zero-mean Gaussian white noise sequence with
$E[\nu_2(t)\nu_2(t)'] = C(t)P(t|t-1)C'(t) + R$, where P(t|t-1) is the Kalman
filter one-step error covariance matrix [17]. In both cases, the linear
and nonlinear innovations are equal. Hence the estimation problem (1)-(3)
or (4)-(6) can be reformulated as that of estimating y(t), a nonlinear
$L_2$-functional of the Gaussian process $x^t$; the estimate y(t|t) is the
nonlinear $L_2$-functional of the innovations process (either $\nu_1^t$ or
$\nu_2^t$) which minimizes the mean square error. The expansion of such
$L_2$-functionals of Gaussian processes is the subject of [4]-[8], and the
application of these results to our recursive estimation problem is
presented in the next section, where a new proof of Theorem 1 is presented
and the corresponding proof of Theorem 2 is outlined.
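The Gram-Schmidt interpretation of the discrete time innovations can be checked numerically. The following is a minimal sketch for a scalar, time-invariant instance of (4),(6); the function names and the parameter values (a, c, q, r, p0) are illustrative assumptions and do not appear in the paper. It compares the innovations variances $C^2 P(t|t-1) + R$ produced by the Kalman filter recursion with the residual variances obtained by orthogonalizing the observation sequence, computed here through a Cholesky factorization of the observation covariance matrix.

```python
import numpy as np


def kalman_innovation_variances(a, c, q, r, p0, horizon):
    """Innovations variances c^2 P(t|t-1) + r from the scalar Kalman recursion."""
    variances = []
    p_pred = p0  # P(0|-1): prior variance of x(0), assumed zero-mean
    for _ in range(horizon + 1):
        s = c * c * p_pred + r          # innovations variance at time t
        variances.append(s)
        gain = p_pred * c / s           # Kalman gain
        p_filt = (1.0 - gain * c) * p_pred
        p_pred = a * a * p_filt + q     # one-step prediction variance
    return np.array(variances)


def gram_schmidt_variances(a, c, q, r, p0, horizon):
    """Residual variances from orthogonalizing z(0), ..., z(horizon) in order."""
    n = horizon + 1
    var_x = np.empty(n)
    var_x[0] = p0
    for t in range(1, n):
        var_x[t] = a * a * var_x[t - 1] + q
    cov_z = np.empty((n, n))
    for s in range(n):
        for t in range(n):
            lo, hi = min(s, t), max(s, t)
            cov_z[s, t] = c * c * a ** (hi - lo) * var_x[lo]
    cov_z += r * np.eye(n)
    # The squared diagonal of the Cholesky factor gives the variance of each
    # z(t) after projecting out z(0), ..., z(t-1).
    chol = np.linalg.cholesky(cov_z)
    return np.diag(chol) ** 2


kalman_vars = kalman_innovation_variances(0.9, 1.0, 0.5, 1.0, 2.0, 6)
gs_vars = gram_schmidt_variances(0.9, 1.0, 0.5, 1.0, 2.0, 6)
print(np.max(np.abs(kalman_vars - gs_vars)))
```

The two sequences agree to rounding error in this example, which is the covariance-level statement that the innovations are the orthogonalized observations.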


3.  $L_2$-Functionals of Gaussian Processes and
Finite Dimensional Estimation

Kallianpur [7] has generalized the earlier results of Cameron and
Martin [5] and Ito [6] on the orthogonal decomposition of $L_2$-functionals
of a Gaussian process. We will not require all of the isomorphisms
presented in [7]; only the following decomposition in terms of Hermite
polynomials will be utilized here [8]. Let x(t), $t \in S$, be any zero-mean
second order Gaussian process defined on $(\Omega, B, P)$; for our purposes S will
be either an interval [0,T] or the discrete time set $\{0;T\}$. Define the
two Hilbert spaces associated with x: the nonlinear space
$L_2(x) \triangleq L_2(\Omega, F^x, P)$, where $F^x$ is the $\sigma$-algebra generated by x(t),
$t \in S$; and the linear space H(x), the closed subspace of $L_2(x)$ spanned by
x(t), $t \in S$.

Lemma 1 [7],[8]: If $\{\xi_\gamma, \gamma \in \Gamma\}$ ($\Gamma$ linearly ordered) is a complete
orthonormal set (CONS) in H(x), then the family

$$ (p_1! \cdots p_k!)^{-1/2} H_{p_1}(\xi_{\gamma_1}) \cdots H_{p_k}(\xi_{\gamma_k}) $$

is a CONS in $L_2(x)$, where $H_n$ is the n-th normalized Hermite polynomial.
That is, any $L_2$-functional $\theta$ of $x(\cdot)$ has the orthogonal series expansion

$$ \theta = \sum_{p \ge 0} \sum_{\substack{p_1 + \cdots + p_k = p \\ \gamma_1 < \cdots < \gamma_k}} a^{p_1 \cdots p_k}_{\gamma_1 \cdots \gamma_k} H_{p_1}(\xi_{\gamma_1}) \cdots H_{p_k}(\xi_{\gamma_k}). \tag{12} $$

Remark: If x has nonzero mean, the representation of Lemma 1 can be
written with respect to a centered CONS, and the coefficients in (12) will
depend on the mean of x.

Corollary 1 [6]: If x(t), $t \in [0,T]$, is a standard Wiener process, then
any $\theta \in L_2(x)$ has the orthogonal expansion

$$ \theta = \sum_{p \ge 0} \int_0^T \int_0^{t_p} \cdots \int_0^{t_2} f_p(t_1, \ldots, t_p)\,dx(t_1) \cdots dx(t_{p-1})\,dx(t_p) \triangleq \sum_{p \ge 0} I_p(f_p) \tag{13} $$

where the integrals in (13) are iterated stochastic integrals; also,
$I_p(f_p)$ and $I_q(f_q)$ are orthogonal for all $p \ne q$.
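The orthonormality underlying Lemma 1 can be illustrated in the simplest case of a single standard Gaussian variable. The sketch below assumes that "normalized Hermite polynomial" means the probabilists' Hermite polynomial $He_n$ divided by $\sqrt{n!}$; this convention, the helper names, and the use of NumPy's `hermite_e` module are our assumptions, not the paper's. It evaluates the Gram matrix $E[H_m(X) H_n(X)]$ by Gauss-Hermite quadrature.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e as He


def normalized_hermite(n, x):
    """Probabilists' Hermite polynomial He_n(x) scaled by 1/sqrt(n!)."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return He.hermeval(x, coeffs) / math.sqrt(math.factorial(n))


def gaussian_inner_product(m, n, num_nodes=40):
    """E[H_m(X) H_n(X)] for X ~ N(0, 1), by Gauss-Hermite quadrature."""
    nodes, weights = He.hermegauss(num_nodes)     # weight function exp(-x^2 / 2)
    weights = weights / math.sqrt(2.0 * math.pi)  # rescale to a probability measure
    values = normalized_hermite(m, nodes) * normalized_hermite(n, nodes)
    return float(np.sum(weights * values))


# Gram matrix of the first five normalized polynomials under N(0, 1).
gram = np.array([[gaussian_inner_product(m, n) for n in range(5)]
                 for m in range(5)])
print(np.round(gram, 10))
```

Under this convention the Gram matrix is the identity up to rounding error. Products of such polynomials in distinct orthonormal Gaussian variables are then orthonormal by independence, which is the structure used in Lemma 1.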


Now we consider the estimation problems of Theorems 1 and 2 in this
framework. Assume throughout this section, for simplicity of notation,
that x, y, $z_1$, and $z_2$ are all scalars; the following results also hold in
the vector case. The state y(t) (as given by (2) or (5)) is a nonlinear
functional of $x^t$; assume that y(t) has a finite Volterra series expansion
(of the form (7) or (8)) of order q. It is then clear that y(t) has a
finite orthogonal series expansion (12) of order q, i.e., with
$a^{p_1 \cdots p_k}_{\gamma_1 \cdots \gamma_k} = 0$ for p > q. In the continuous case, the
$\{\xi_\gamma\}$ are centered versions of functionals of the form
$\int_0^t \phi_\gamma(s) x(s)\,ds$, while in discrete time the $\{\xi_i\}$ are just centered
linear combinations of the x(s), $s \in \{1;t\}$. The estimate y(t|t) is a
nonlinear $L_2$-functional of the Gaussian innovations process; thus it also
has an orthogonal expansion of the form (12). In continuous time $\nu_1(t)$ is
a Wiener process, so y(t|t) has the expansion (13) with
$x(t) = R^{-1/2} \nu_1(t)$. In discrete time, $\nu_2(t)$ is an orthogonal sequence, so
$\eta(t) \triangleq [C(t)^2 P(t|t-1) + R]^{-1/2} \nu_2(t)$ is a CONS in $H(\nu_2)$, and the
expansion (12) is valid with $\xi_i = \eta(i)$.


Thus Theorems 1 and 2 can be proved by showing that: (a) the
orthogonal series expansion of y(t|t) has only a fixed finite number of
terms for all t; and (b) such a finite orthogonal series can be realized
as the output of a finite dimensional recursive system (i.e., a system
in state-space form). The states of this finite dimensional system are
the additional filter states referred to in Section 2. The following
theorem proves (a) for a more general formulation; the proof of (b) must
be done separately for continuous and discrete time, and involves the
calculation and separability of the Volterra kernels.

Theorem 3: Let x(t), z(t), $t \in S$, be zero-mean jointly Gaussian second
order processes, and assume that $y \in L_2(x)$ has an orthogonal series
expansion of order q. Let the orthogonal expansion of $\hat{y} \triangleq E[y | F^z]$ be
given by

$$ \hat{y} = \sum_{r \ge 0} \sum_{\substack{r_1 + \cdots + r_j = r \\ \beta_1 < \cdots < \beta_j}} b^{r_1 \cdots r_j}_{\beta_1 \cdots \beta_j} H_{r_1}(\eta_{\beta_1}) \cdots H_{r_j}(\eta_{\beta_j}) \tag{14} $$

where $\{\eta_\beta, \beta \in \Lambda_1\}$ is a CONS in H(z). Then

$$ b^{r_1 \cdots r_j}_{\beta_1 \cdots \beta_j} = 0, \quad r > q; $$

that is, $\hat{y}$ also has an orthogonal expansion of order q.

Proof: Consider H(x,z), the linear space spanned by $\{x(t), z(t); t \in S\}$.
Since $\{\eta_\beta, \beta \in \Lambda_1\}$ is an orthonormal set in H(x,z), it can be completed
by adding elements $\{\eta_\beta, \beta \in \Lambda_2\}$ in H(x,z) to form the CONS
$\{\eta_\beta, \beta \in \Lambda_1 \cup \Lambda_2\}$ in H(x,z). The orthogonal expansion for y can then
be rewritten in terms of this CONS in H(x,z); the new expansion is clearly
also of order q:

$$ y = \sum_{p=0}^{q} \sum_{\substack{p_1 + \cdots + p_k = p \\ \gamma_1 < \cdots < \gamma_k}} c^{p_1 \cdots p_k}_{\gamma_1 \cdots \gamma_k} H_{p_1}(\eta_{\gamma_1}) \cdots H_{p_k}(\eta_{\gamma_k}) \tag{15} $$

where $\{\gamma_i\} \in \Lambda_1 \cup \Lambda_2$. Now $\hat{y}$ is the orthogonal projection of y onto
$L_2(z)$; that is, by Lemma 1 $\hat{y}$ is just the projection of y onto the space
spanned by the products $H_{p_1}(\eta_{\gamma_1}) \cdots H_{p_k}(\eta_{\gamma_k})$ with
$\gamma_1, \ldots, \gamma_k \in \Lambda_1$. The orthogonality of such products in $L_2(x,z)$ (see
Lemma 1) then yields

$$ \hat{y} = \sum_{p=0}^{q} \sum_{\substack{p_1 + \cdots + p_k = p \\ \gamma_1 < \cdots < \gamma_k}} c^{p_1 \cdots p_k}_{\gamma_1 \cdots \gamma_k} H_{p_1}(\eta_{\gamma_1}) \cdots H_{p_k}(\eta_{\gamma_k}) \tag{16} $$

where now $\gamma_1, \ldots, \gamma_k \in \Lambda_1$, thus proving the theorem.
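A static scalar instance illustrates the conclusion of Theorem 3. In the sketch below, x and v are independent zero-mean Gaussian variables, z = x + v, and $y = x^2$, which has an expansion of order q = 2 in H(x). The variances, the helper names, and the $He_n/\sqrt{n!}$ normalization are illustrative assumptions. Because $\hat{y}$ is the projection of y onto $L_2(z)$, its coefficients on $H_r(\eta)$, with $\eta$ the normalized z, equal $E[y H_r(\eta)]$; these are computed by two-dimensional Gauss-Hermite quadrature.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e as He


def normalized_hermite(n, x):
    """Probabilists' Hermite polynomial He_n(x) scaled by 1/sqrt(n!)."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return He.hermeval(x, coeffs) / math.sqrt(math.factorial(n))


def projection_coefficients(var_x, var_v, max_order=5, num_nodes=30):
    """Coefficients E[y H_r(eta)] of E[y | z] for y = x^2 and z = x + v."""
    nodes, weights = He.hermegauss(num_nodes)
    weights = weights / math.sqrt(2.0 * math.pi)
    u, w = np.meshgrid(nodes, nodes, indexing="ij")
    joint_weights = np.outer(weights, weights)   # independent standard normals
    x = math.sqrt(var_x) * u
    v = math.sqrt(var_v) * w
    eta = (x + v) / math.sqrt(var_x + var_v)     # unit-variance element spanning H(z)
    y = x ** 2
    return [float(np.sum(joint_weights * y * normalized_hermite(r, eta)))
            for r in range(max_order + 1)]


coeffs = projection_coefficients(var_x=2.0, var_v=1.0)
print(np.round(coeffs, 10))
```

In this example only the coefficients of order 0 and 2 are nonzero: the first-order coefficient vanishes because x has zero mean, and all coefficients of order greater than q = 2 vanish, as the theorem asserts.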


Theorem 3 also holds for nonzero-mean and vector-valued processes,
with obvious modifications in the proof. This theorem then applies to
y(t|t) of Theorems 1 and 2. It remains only to prove that the finite
orthogonal series expansion for y(t|t) is realizable with a nonlinear
recursive system of fixed finite dimension. Consider first the continuous
time problem (1)-(3).

Proof of Theorem 1: Assume that y(t) has a finite Volterra series
expansion of order q. Then Theorem 3 implies that y(t|t) has the
orthogonal expansion

$$ y(t|t) = \sum_{p=0}^{q} \int_0^t \int_0^{s_p} \cdots \int_0^{s_2} f_p(t, s_1, \ldots, s_p)\,d\nu(s_1) \cdots d\nu(s_p) \tag{17} $$

where $\nu(t) \triangleq R^{-1/2} \nu_1(t)$. The projection theorem and the orthogonality of
the iterated stochastic integrals [6] imply that, for $s_1 < \cdots < s_p < t$,

$$ f_p(t, s_1, \ldots, s_p) = \frac{1}{p!} \frac{\partial^p}{\partial s_1 \cdots \partial s_p} E[y(t)\nu(s_1) \cdots \nu(s_p)] \tag{18} $$

(the proof of (18) is analogous to that of Davis [14, p. 95] for the best
linear estimate). A proof identical to that of Brockett [11] for the
deterministic case shows that, if the kernels (18) are separable (see (9)),
then y(t|t) in (17) can be generated as the output of a finite dimensional
bilinear system driven by the innovations $\nu(t)$. Hence, Theorem 1 is
proved if the kernels in (18) are separable.
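The role of separability in obtaining a finite dimensional realization can be seen already for the first-order term of (17). The following sketch is a discretized illustration only: the functions `alphas` and `betas`, the time grid, and the simulated increments are hypothetical, and the stochastic integral is replaced by a left-endpoint sum. For a kernel $f(t,s) = \sum_\ell \alpha_\ell(t)\beta_\ell(s)$, the sum over past increments can be carried by m running states instead of being recomputed from the whole past.

```python
import numpy as np


def direct_first_order_term(alphas, betas, times, increments):
    """Brute-force sum_{j<n} f(t_n, s_j) dnu_j with f(t,s) = sum_l alpha_l(t) beta_l(s)."""
    out = np.zeros(len(times))
    for n, t in enumerate(times):
        total = 0.0
        for j in range(n):
            kernel = sum(a(t) * b(times[j]) for a, b in zip(alphas, betas))
            total += kernel * increments[j]
        out[n] = total
    return out


def recursive_first_order_term(alphas, betas, times, increments):
    """Same quantity from m running states xi_l(t_n) = sum_{j<n} beta_l(s_j) dnu_j."""
    states = np.zeros(len(betas))
    out = np.zeros(len(times))
    for n, t in enumerate(times):
        out[n] = sum(a(t) * x for a, x in zip(alphas, states))
        # State update driven linearly by the current increment.
        states += np.array([b(t) for b in betas]) * increments[n]
    return out


# Hypothetical separable kernel with m = 2 terms.
alphas = [lambda t: np.exp(-t), lambda t: np.exp(-2.0 * t)]
betas = [lambda s: np.exp(s), lambda s: s * np.exp(2.0 * s)]
times = np.linspace(0.0, 1.0, 51)
rng = np.random.default_rng(0)
increments = rng.normal(scale=np.sqrt(times[1] - times[0]), size=times.size)

direct = direct_first_order_term(alphas, betas, times, increments)
recursive = recursive_first_order_term(alphas, betas, times, increments)
print(np.max(np.abs(direct - recursive)))
```

The two evaluations agree to rounding error, while the recursive form stores only m numbers. Higher-order terms are handled in the same spirit by states driven by products of lower-order states and the increments, which is the bilinear structure referred to above.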

Lemma 2: The triangular kernels $f_p(t, s_1, \ldots, s_p)$ given by (18) are
separable for $s_1 < \cdots < s_p < t$ under the hypotheses of Theorem 1.

Proof: Let y(t) be given by one k-th order term in the finite Volterra
series (7); the proof generalizes in the obvious way. Since the kernels
of (7) are separable due to the analyticity assumption in (2), we
can assume that

$$ y(t) = \int_0^t \int_0^{\tau_k} \cdots \int_0^{\tau_2} \gamma_1(\tau_1) \cdots \gamma_k(\tau_k)\, x(\tau_1) \cdots x(\tau_k)\,d\tau_1 \cdots d\tau_k. \tag{19} $$

Thus, by the Fubini theorem (see [1],[13]),

$$ f_p(t, s_1, \ldots, s_p) = \frac{1}{p!} \int_0^t \int_0^{\tau_k} \cdots \int_0^{\tau_2} \gamma_1(\tau_1) \cdots \gamma_k(\tau_k)\, \frac{\partial^p}{\partial s_1 \cdots \partial s_p} E[x(\tau_1) \cdots x(\tau_k)\nu(s_1) \cdots \nu(s_p)]\,d\tau_1 \cdots d\tau_k. \tag{20} $$

Since $x(\tau_1), \ldots, x(\tau_k), \nu(s_1), \ldots, \nu(s_p)$ are jointly Gaussian, the
expectation in (20) can be expanded via Lemma B.1 of [1],
resulting in a sum of products of terms of the form: $E[x(\tau_i)]$, $E[\nu(s_j)]$,
$cov[x(\tau_i), x(\tau_j)]$, $cov[x(\tau_i), \nu(s_j)]$, and $cov[\nu(s_i), \nu(s_j)]$. Notice that
$E[\nu(s_j)] = 0$, so all products involving such terms are zero. If
$cov[\nu(s_i), \nu(s_j)]$ arises, it results in a term of the form
$\frac{\partial^2}{\partial s_i \partial s_j} cov[\nu(s_i), \nu(s_j)]$, which can be shown to be zero for
$s_i \ne s_j$ (since $\nu$ is a standard Wiener process,
$cov[\nu(s_i), \nu(s_j)] = \min(s_i, s_j)$). Also, $cov[x(\tau_i), x(\tau_j)]$ is the
covariance function of the state of the linear system (1); hence, for
$\tau_i \ge \tau_j$,

$$ cov[x(\tau_i), x(\tau_j)] = \exp\Big[\int_{\tau_j}^{\tau_i} A(\sigma)\,d\sigma\Big]\, cov[x(\tau_j), x(\tau_j)]. \tag{21} $$

Footnote 1: In general, whenever the linear innovations $\nu(t)$ in a nonlinear
estimation problem form a Wiener process, then an (infinite) orthogonal
expansion of the form (17) will hold for the estimate of each $L_2$-state
y(t), and the kernels are calculated via (18). The sum of the first two
terms (p = 0,1) in (17) is the best linear estimate, the sum of the terms
for p = 0,1,2 yields the best quadratic estimate, etc. These are not
necessarily realizable with finite dimensional recursive filters.


Finally, consider

$$ R^{1/2} cov[x(\tau_i), \nu(s_j)] = cov\Big[x(\tau_i),\ z_1(s_j) - \int_0^{s_j} C(\sigma) x(\sigma|\sigma)\,d\sigma\Big] $$

$$ = cov\Big[x(\tau_i),\ \int_0^{s_j} C(\sigma)(x(\sigma) - x(\sigma|\sigma))\,d\sigma + R^{1/2} v(s_j)\Big] $$

$$ = cov\Big[x(\tau_i),\ \int_0^{s_j} C(\sigma)(x(\sigma) - x(\sigma|\sigma))\,d\sigma\Big] \tag{22} $$

since $x(\tau_i)$ and $v(s_j)$ are independent. This gives rise in (20) to

$$ \frac{\partial}{\partial s_j} cov[x(\tau_i), \nu(s_j)] = R^{-1/2} cov[x(\tau_i), C(s_j)(x(s_j) - x(s_j|s_j))] = C(s_j) R^{-1/2} cov[x(\tau_i), x(s_j) - x(s_j|s_j)], \tag{23} $$

which is the covariance function of a finite (two-) dimensional linear
system with states x(t) and x(t)-x(t|t), and is thus also separable.

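Lemma B.1 of [1] and the separability of the relevant covariance
functions imply that there exist functions $\alpha_{\ell i}$ and $\beta_{\ell j}$ such that (20)
can be written as

$$ f_p(t, s_1, \ldots, s_p) = \frac{1}{p!} \int_0^t \int_0^{\tau_k} \cdots \int_0^{\tau_2} \gamma_1(\tau_1) \cdots \gamma_k(\tau_k) \Big( \sum_{\ell=1}^{m} \alpha_{\ell 1}(\tau_1) \cdots \alpha_{\ell k}(\tau_k)\, \beta_{\ell 1}(s_1) \cdots \beta_{\ell p}(s_p) \Big) d\tau_1 \cdots d\tau_k $$

$$ = \frac{1}{p!} \sum_{\ell=1}^{m} \alpha_\ell(t)\, \beta_{\ell 1}(s_1) \cdots \beta_{\ell p}(s_p), \tag{24} $$

and $f_p$ is separable as claimed; this also completes the proof of Theorem 1.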

An  example  in  which  the  kernels  and  the  recursive  estimator  are 
computed  explicitly  is  presented  in  the  next  section.  The  discrete  time 
result  which  is  analogous  to  Lemma  2 can  be  used  to  prove  Theorem  2,  but 
for  the  sake  of  brevity  we  will  only  present  an  example  of  the  procedure 
(Section  5). 


4.  A Continuous  Time  Example 

Before  discussing  the  example,  we  present  an  extension  of  Theorem  1 
to  a class  of  systems  in  which  y(t)  contains  process  noise;  the  analogous 
extension  of  Theorem  2 is  proved  in  the  same  manner. 

Theorem 4: Consider the system (1)-(3), and assume that (2) has a
finite Volterra series expansion. Assume that there is an additional
state $y_1(t)$ satisfying

$$ dy_1(t) = (F(t)y_1(t) + G(t)y(t))\,dt + H(t)\,d\tilde{w}(t) \tag{25} $$

where $\tilde{w}$ is a Wiener process and $\{x(0), y(0), y_1(0), w(t_1), v(t_2), \tilde{w}(t_3)\}$
are independent for all $t_1, t_2, t_3$. Then $y_1(t|t)$ can also be computed with
a finite dimensional recursive estimator.

Proof: The solution of (25) is

$$ y_1(t) = \Phi(t,0) y_1(0) + \int_0^t \Phi(t,s) G(s) y(s)\,ds + \int_0^t \Phi(t,s) H(s)\,d\tilde{w}(s) \triangleq \tilde{y}_1(t) + \int_0^t \Phi(t,s) H(s)\,d\tilde{w}(s) \tag{26} $$

where $\Phi$ is the state transition matrix for F. Since $\tilde{w}(\cdot)$ and $z(\cdot)$ are
independent, $y_1(t|t) \triangleq E[y_1(t) | F_t^z] = E[\tilde{y}_1(t) | F_t^z]$, and $\tilde{y}_1(t)$ is
just described by a finite Volterra series expansion in x. The theorem
then follows from Theorem 1.

Example 1: Consider the scalar system

$$ dx(t) = -\alpha x(t)\,dt + dw_1(t) \tag{27} $$

$$ dy(t) = (-\gamma y(t) + x^2(t))\,dt + dw_2(t) \tag{28} $$

$$ dz(t) = x(t)\,dt + dv(t) \tag{29} $$

with the same assumptions as in Theorem 4. The solution of (28) is

$$ y(t) = e^{-\gamma t} y(0) + \int_0^t e^{-\gamma(t-\sigma)} x^2(\sigma)\,d\sigma + \int_0^t e^{-\gamma(t-\sigma)}\,dw_2(\sigma). \tag{30} $$

By Theorems 3 and 4, it follows that

$$ y(t|t) = f_0(t) + \int_0^t f_1(t,s)\,d\nu(s) + \int_0^t \int_0^{s_2} f_2(t, s_1, s_2)\,d\nu(s_1)\,d\nu(s_2), \tag{31} $$

where $\nu(t) = z(t) - \int_0^t x(s|s)\,ds$. Using (18) to compute the kernels as in
Lemma 2, we have (since y(0) and $w_2$ are independent of $\nu$)

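$$ f_0(t) = E[y(t)] = e^{-\gamma t} E[y(0)] + \int_0^t e^{-\gamma(t-\sigma)} E[x^2(\sigma)]\,d\sigma \tag{32} $$

$$ f_1(t,s) = \frac{\partial}{\partial s} E[y(t)\nu(s)] = \int_0^t e^{-\gamma(t-\sigma)} \frac{\partial}{\partial s} E[x^2(\sigma)\nu(s)]\,d\sigma $$

$$ = 2 \int_0^t e^{-\gamma(t-\sigma)} m(\sigma) \frac{\partial}{\partial s} E\Big[x(\sigma) \int_0^s (x(\tau) - x(\tau|\tau))\,d\tau\Big]\,d\sigma $$

$$ = 2 \int_0^t e^{-\gamma(t-\sigma)} m(\sigma) E[x(\sigma)(x(s) - x(s|s))]\,d\sigma \tag{33} $$

where $m(\sigma) \triangleq E[x(\sigma)] = e^{-\alpha\sigma} E[x(0)]$. It can be shown, using Lemma 2.2 of
[1], that

$$ E[x(\sigma)(x(s) - x(s|s))] = \begin{cases} e^{-\alpha(\sigma-s)} P(s), & \sigma \ge s \\ K(s,\sigma) P(s), & \sigma < s \end{cases} \tag{34} $$

where $K(s,\sigma) \triangleq \exp[\alpha(s-\sigma) - \int_\sigma^s P^{-1}(\tau)\,d\tau]$ and P(t) is the Kalman filter
error covariance for x(t). Thus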


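$$ f_1(t,s) = 2 \Big[ \int_0^s e^{-\gamma(t-\sigma)} m(\sigma) K(s,\sigma)\,d\sigma + \int_s^t e^{-\gamma(t-\sigma)} m(\sigma) e^{-\alpha(\sigma-s)}\,d\sigma \Big] P(s). \tag{35} $$

Similarly,

$$ f_2(t, s_1, s_2) = \Big[ \int_0^{s_1} e^{-\gamma(t-\sigma)} K(s_1,\sigma) K(s_2,\sigma)\,d\sigma + \int_{s_1}^{s_2} e^{-\gamma(t-\sigma)} e^{-\alpha(\sigma-s_1)} K(s_2,\sigma)\,d\sigma + \int_{s_2}^{t} e^{-\gamma(t-\sigma)} e^{-\alpha(\sigma-s_1)} e^{-\alpha(\sigma-s_2)}\,d\sigma \Big] P(s_1) P(s_2) \tag{36} $$

(assuming that $0 < s_1 < s_2 < t$).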


These kernels are obviously separable, so y(t|t) can be realized as
the output of a finite dimensional bilinear system driven by the
innovations. However, it may not be efficient to realize each term in
(31) individually. In fact, one efficient recursive realization of y(t|t)
is readily derived via the procedure of [1, Example 2.1]; a recursive 3-state
filter which computes both x(t|t) and y(t|t) is constructed as follows.
First, augment the state x(t) of (27) with the additional state $\xi(t)$
given by

$$ \dot{\xi}(t) = (\alpha - \gamma - P^{-1}(t))\xi(t) + x(t); \quad \xi(0) = 0. \tag{37} $$

Then the Kalman-Bucy 2-state filter for the linear system (27), (37) with
observations (29) recursively computes x(t|t) and $\xi(t|t)$. Finally, y(t|t)
is computed by

$$ dy(t|t) = (-\gamma y(t|t) + [x(t|t)]^2 + P(t))\,dt + 2P(t)\xi(t|t)\,d\nu(t); \quad y(0|0) = 0. \tag{38} $$

To check that this filter has the series expansion (31), (32), (35), (36)
is straightforward.

It  should  also  be  noted  that  if  x(t)  has  zero  mean,  then  the  best 
linear estimate of y(t) given $z^t$ (the first two terms in (31)) is equal to
the  a priori  mean  of  y(t).  This  is  due  to  the  fact  that,  in  this  case,  y 
and  z are  uncorrelated.  However,  since  y and  z are  not  independent,  the 
best  quadratic  estimator  (which  is  equal  to  the  conditional  mean  in  this 
example)  can  in  fact  offer  significant  improvement  in  estimator 
performance  (see  [18]  for  some  case  studies  and  further  analysis  along 
these  lines). 


5.  A Discrete Time Example

Example 2: Consider the scalar discrete time system

$$ x(t+1) = \alpha x(t) + w_1(t) \tag{39} $$

$$ y(t+1) = \gamma y(t) + x^2(t) + w_2(t) \tag{40} $$

$$ z(t) = x(t) + v(t) \tag{41} $$

These kernels can, as in Example 1, be explicitly evaluated. They
are indeed separable, and a 3-state filter can be constructed as follows
using the methods of [1],[2]. First, augment the state x(t) of (39) with

$$ \xi(t+1) = \ldots; \quad \xi(0) = 0. \tag{45} $$

Then x(t|t) and $\xi(t|t)$ can be calculated by a 2-state Kalman filter.
Finally,

$$ y(t+1|t+1) = \gamma y(t|t) + x(t|t)^2 + \delta(t) + 2M(t,t+1)[x(t|t) + \gamma\xi(t|t)]\nu(t+1) + \Big[\sum_{i=0}^{t} \gamma^{t-i} M(i,t+1)^2\Big] \nu(t+1)^2; \quad y(0|0) = 0 \tag{46} $$

where

$$ M(i,t+1) = \alpha^{t-i+1} \frac{P(i|i) \cdots P(t+1|t+1)}{P(i+1|i) \cdots P(t+1|t)} \tag{47} $$

and $\delta(t)$ are deterministic functions, and $\nu(t) = z(t) - \alpha x(t-1|t-1)$ is
the unnormalized innovations process.

Notice that the recursive optimal estimator (46) contains a final
term which is quadratic in the innovations. In general, if y(t) contains
a Volterra series of order q in x(t), then the recursive estimator for
y(t|t) will contain polynomials of degree q in $\nu(t)$. This result was
proved in [2], but it also follows naturally from the orthogonal series
decomposition (16) of y(t|t): if y(t) has a Volterra series of order q,
(16) will contain terms such as $H_q(\eta(t))$, or polynomials of order q in
$\eta(t)$. This phenomenon does not occur in continuous time estimation
problems with observations corrupted by "Gaussian white noise", in which
the optimal recursive estimator is always linear in the innovations.
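The appearance of a quadratic term in the innovations can be seen in a single measurement update. The sketch below is a static illustration under assumed values, not the filter (46): x has a Gaussian prior with mean m and variance p, the observation is z = x + v with noise variance r, and the quantity estimated is $x^2$. Since $E[x^2 | z] = (m + K\nu)^2 + (1-K)p$ with $K = p/(p+r)$ and $\nu = z - m$, the update is a second-degree polynomial in $\nu$. The helper names and parameters are ours; the closed form is checked against a direct quadrature of the posterior.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e as He


def quadratic_update_coefficients(prior_mean, prior_var, noise_var):
    """Coefficients (c0, c1, c2) with E[x^2 | z] = c0 + c1*nu + c2*nu^2."""
    gain = prior_var / (prior_var + noise_var)
    post_var = (1.0 - gain) * prior_var
    return np.array([prior_mean ** 2 + post_var,
                     2.0 * prior_mean * gain,
                     gain ** 2])


def conditional_second_moment(z, prior_mean, prior_var, noise_var, num_nodes=80):
    """E[x^2 | z] by Gauss-Hermite quadrature over the prior, weighted by the likelihood."""
    nodes, weights = He.hermegauss(num_nodes)
    x = prior_mean + math.sqrt(prior_var) * nodes
    likelihood = np.exp(-0.5 * (z - x) ** 2 / noise_var)
    return float(np.sum(weights * likelihood * x ** 2)
                 / np.sum(weights * likelihood))


m, p, r = 0.5, 1.0, 1.0
c0, c1, c2 = quadratic_update_coefficients(m, p, r)
innovations = np.array([-1.5, 0.0, 0.7, 2.0])
polynomial = c0 + c1 * innovations + c2 * innovations ** 2
numerical = np.array([conditional_second_moment(m + nu, m, p, r)
                      for nu in innovations])
print(np.max(np.abs(polynomial - numerical)))
```

An estimator whose update is constrained to be linear in $\nu$ cannot reproduce the $c_2 \nu^2$ term, which is the structural point made above about discrete time designs.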

In [2], this contrast is explained by means of the different
martingale representation theorems in continuous and discrete time [19],
[20]. However, a simple explanation is provided by the representation
(16). In continuous time, the elements of the CONS in $L_2(z_1^t)$ are of
the form $\int_0^t \phi_\gamma(s)\,d\nu_1(s)$, and the series (16) can be expressed in terms of
iterated stochastic integrals as in (17). Given separability, the series
can then be realized with a finite dimensional bilinear system; that is,
the stochastic differential equations in the realization are linear in
$d\nu_1(t)$. In the discrete time case, the elements $\eta_\gamma$ of the CONS in
$L_2(z_2^t)$ are given by the normalized discrete time innovations $\eta(t)$; the
series (16) then gives rise to a finite Volterra series in the innovations
$\nu_2(t)$ which contains polynomials in $\nu_2(t)$. Given the appropriate
realizability conditions, this series can be realized by a finite
dimensional state-affine system [2],[21]; that is, the recursive equations
in the realization contain polynomials in $\nu_2(t)$. Hence state-affine
equations containing polynomials in $\nu_2(t)$ arise in a very natural way as
realizations of the finite series expansion of y(t|t).